Investigating the Effects of Dynamic Precision Scaling on Neural Network Training
Authors
Abstract
Training neural networks is a time- and compute-intensive operation, mainly because of the large number of floating-point tensor operations required during training. These constraints limit the scope of design space exploration (in terms of hyperparameter search) for data scientists and researchers. Recent work has explored reducing the numerical precision used to represent parameters, activations, and gradients during neural network training as a way to reduce the computational cost of training, and thus the training time [1][2]. In this paper we develop a novel dynamic precision scaling scheme and evaluate its performance, comparing it to previous work. Using stochastic fixed-point rounding, a quantization-error-based scaling scheme, and dynamic bit-widths during training, we achieve 98.8% test accuracy on the MNIST dataset using an average bit-width of just 16 bits for weights and 14 bits for activations. This beats the previous state-of-the-art dynamic bit-width precision scaling algorithm.
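The abstract names stochastic fixed-point rounding but does not spell it out. The following is a minimal sketch of the general technique, not the paper's implementation: the function name `stochastic_round_fixed_point`, its parameters, the convention that the sign bit is counted in `int_bits`, and the saturating behavior are all assumptions made for illustration.

```python
import numpy as np


def stochastic_round_fixed_point(x, int_bits, frac_bits, rng):
    """Quantize x to a signed fixed-point grid using stochastic rounding.

    Hypothetical helper for illustration. The total width is assumed to be
    int_bits + frac_bits, with the sign bit counted in int_bits.
    """
    scale = 2.0 ** frac_bits
    scaled = np.asarray(x, dtype=np.float64) * scale
    lower = np.floor(scaled)
    # Round up with probability equal to the fractional remainder, so the
    # expected value of the result equals the input (unbiased rounding).
    round_up = rng.random(scaled.shape) < (scaled - lower)
    q = lower + round_up
    # Saturate to the representable two's-complement range (an assumption).
    total_bits = int_bits + frac_bits
    q_min = -(2.0 ** (total_bits - 1))
    q_max = 2.0 ** (total_bits - 1) - 1
    return np.clip(q, q_min, q_max) / scale


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.full(100_000, 0.3)
    q = stochastic_round_fixed_point(x, int_bits=4, frac_bits=2, rng=rng)
    # Each sample lands on a neighboring grid point (0.25 or 0.5);
    # the average stays close to 0.3.
    print(np.unique(q), q.mean())
```

Because each value is rounded up or down at random in proportion to its distance from the two neighboring grid points, the quantization error averages out across many updates, which is the usual motivation for preferring it over round-to-nearest at low bit-widths.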
Similar resources
Applying GMDH artificial neural network to predict dynamic viscosity of an antimicrobial nanofluid
Objective(s): Artificial neural networks (ANNs) are widely used for predicting systems' behavior. GMDH is a type of ANN with a remarkable ability in pattern recognition. The aim of the current study is to propose a model to predict the dynamic viscosity of silver/water nanofluid, which can be used as an antimicrobial fluid for several medical purposes. Materials and Methods: In order to have precise mode...
A Convolutional Neural Network based on Adaptive Pooling for Classification of Noisy Images
The convolutional neural network is an effective method for classifying images; it learns using convolutional, pooling, and fully-connected layers. All kinds of noise disrupt the operation of this network. Noisy images reduce classification accuracy and increase convolutional neural network training time. Noise is an unwanted signal that destroys the original signal. Noise chang...
Neural Network Prediction of Warm Deformation Flow Curves in Ferrite+ Cementite Region
Many efforts have been made to model the hot deformation (dynamic recrystallization) flow curves of different materials. Phenomenological constitutive models, physical-based constitutive models, and artificial neural network (ANN) models are the main methods used for this purpose. However, there is no report on the modeling of warm deformation (dynamic spheroidization) flow curves of any kin...
Robust Fault Detection on Boiler-turbine Unit Actuators Using Dynamic Neural Networks
Due to the important role of the boiler-turbine units in industries and electricity generation, it is important to diagnose different types of faults in different parts of boiler-turbine system. Different parts of a boiler-turbine system like the sensor or actuator or plant can be affected by various types of faults. In this paper, the effects of the occurrence of faults on the actuators are in...
Identify enablers of agility and agile modeling strategy with neural network approach
The electronic industry faces a rapidly changing and highly competitive environment. Thus, firms have an essential need to strive for competitive advantage. Strategy Organizational Agility (SOA) is a tool that helps firms attain competitive advantage. Therefore, this study benchmarks the core competencies from a case study within the supply chain network and establis...
Journal title:
- CoRR
Volume abs/1801.08621 Issue
Pages -
Publication date 2018